TRAP-TANDEM: data-driven extraction of temporal features from speech - Automatic Speech Recognition and Understanding, 2003. ASRU '03. 2003 IEEE Workshop on

Author

  • Hynek Hermansky
Abstract

Conventional features in automatic speech recognition describe the instantaneous shape of the short-term speech spectrum. TRAP-TANDEM features instead describe the likelihood of sub-word classes at a given time instant, derived from the temporal trajectories of band-limited spectral densities in the vicinity of that instant. The paper presents some of the rationale behind the data-driven TRAP-TANDEM approach, briefly describes the technique, points to relevant publications, and summarizes the results achieved so far.
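As a rough illustration of what "temporal trajectories of band-limited spectral densities in the vicinity of the given instant" could look like in code, the sketch below collects, for each frequency band, the band's energy values in a window of frames around a chosen frame. It is not the paper's implementation: the function name `band_trajectories`, the `half_width` parameter, the edge-padding rule, and the toy input are all assumptions made for this example, and the later step of estimating sub-word class likelihoods from these trajectories is not shown.

```python
def band_trajectories(spectrogram, center, half_width):
    """Return one temporal trajectory per frequency band around frame `center`.

    spectrogram: list of frames, each a list of per-band energies.
    Frames that fall outside the signal are filled by repeating the nearest
    edge frame (an assumption made for this sketch).
    """
    if not spectrogram:
        raise ValueError("spectrogram must contain at least one frame")
    n_frames = len(spectrogram)
    n_bands = len(spectrogram[0])
    trajectories = []
    for band in range(n_bands):
        trajectory = []
        for t in range(center - half_width, center + half_width + 1):
            # Clamp the frame index so the window never leaves the signal.
            clamped = min(max(t, 0), n_frames - 1)
            trajectory.append(spectrogram[clamped][band])
        trajectories.append(trajectory)
    return trajectories


# Toy input: 5 frames, 2 bands (values are made up for illustration).
toy = [[0.0, 10.0], [1.0, 11.0], [2.0, 12.0], [3.0, 13.0], [4.0, 14.0]]
trajs = band_trajectories(toy, center=1, half_width=2)
```

Each entry of `trajs` is the time series of a single band, in contrast to a conventional feature vector, which would be a single frame of `toy` taken across all bands.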


Similar resources


Results from a survey of attendees at ASRU 1997 and 2003

In 1997 the author conducted a survey at the IEEE workshop on ‘Automatic Speech Recognition and Understanding’ (ASRU) in which attendees were offered a set of twelve putative future events to which they were asked to assign a date. Six years later, at ASRU’2003, the author repeated the survey with eight additional items. This paper presents the combined results from both surveys.


Progress and Prospects for Speech Technology: Results from Three Sexennial Surveys

In 1997, and again in 2003, the author was invited to conduct a survey at the IEEE workshop on ‘Automatic Speech Recognition and Understanding’ (ASRU) in which attendees were offered a set of statements about putative future events relating to progress in various aspects of speech technology R&D. The task of the respondents was to assign a date to each possible event. The 1997 and 2003 results ...


Speech Emotion Recognition Using Scalogram Based Deep Structure

Speech Emotion Recognition (SER) is an important part of speech-based Human-Computer Interface (HCI) applications. Previous SER methods rely on the extraction of features and training an appropriate classifier. However, most of those features can be affected by emotionally irrelevant factors such as gender, speaking styles and environment. Here, an SER method has been proposed based on a concat...


A spoken dialogue system for conference/workshop services

This paper describes our progress towards building a telephony-based spoken dialogue system for workshop/conference services. A mixed-initiative dialogue system has been developed that is engineered to offer users natural interaction with the system, ease-of-use and robustness towards ambiguous requests and machine errors. A prototype system, known as W99, is described in this paper which was de...



Publication date: 2004